Abstract

Recently, the secrecy capacity of the multi-antenna wiretap channel was characterized by
Khisti and Wornell [1] using a Sato-like argument. This note presents an alternative
characterization using a channel enhancement argument. This characterization relies on an extremal
entropy inequality recently proved in the context of multi-antenna broadcast channels, and is
built directly on the physical intuition regarding the optimal transmission strategy in this
communication scenario.



1 Introduction 



Consider a multi-antenna wiretap channel with $n_t$ transmit antennas and $n_r$ and $n_e$ receive antennas
at the legitimate receiver and the eavesdropper, respectively:

\[
\begin{aligned}
\mathbf{y}_r[m] &= \mathbf{H}_r \mathbf{x}[m] + \mathbf{w}_r[m] \\
\mathbf{y}_e[m] &= \mathbf{H}_e \mathbf{x}[m] + \mathbf{w}_e[m]
\end{aligned}
\tag{1}
\]

where $\mathbf{H}_r \in \mathbb{R}^{n_r \times n_t}$ and $\mathbf{H}_e \in \mathbb{R}^{n_e \times n_t}$ are the channel matrices associated with the legitimate receiver
and the eavesdropper. The channel matrices $\mathbf{H}_r$ and $\mathbf{H}_e$ are assumed to be fixed during the entire
transmission and are known to all three terminals. The additive noise vectors $\mathbf{w}_r[m]$ and $\mathbf{w}_e[m]$ are white
Gaussian with zero mean and are independent across the time index $m$. The channel input
satisfies a total power constraint

\[
\frac{1}{n} \sum_{m=1}^{n} \|\mathbf{x}[m]\|^2 \le P. \tag{2}
\]
The secrecy capacity is defined as the maximum rate of communication such that the information
can be decoded arbitrarily reliably at the legitimate receiver but not at the eavesdropper.



For a discrete memoryless wiretap channel $P(Y_r, Y_e | X)$, a single-letter expression for the secrecy
capacity was obtained by Csiszár and Körner [2] and can be written as

\[
C = \max_{p(u,x)} \left[ I(U; Y_r) - I(U; Y_e) \right] \tag{3}
\]

where $U$ is an auxiliary random variable over a certain alphabet that satisfies the Markov relation
$U - X - (Y_r, Y_e)$. Moreover, (3) extends to continuous alphabet cases with power constraint, so
the problem of characterizing the secrecy capacity of the multi-antenna wiretap channel reduces to
evaluating (3) for the specific channel model (1).

Note that evaluating (3) involves solving a functional, nonconvex optimization problem. Solving
optimization problems of this type usually requires nontrivial techniques and strong inequalities.
Indeed, for the single-antenna case ($n_t = n_r = n_e = 1$), the capacity expression (3) was successfully
evaluated by Leung and Hellman [3] using a result of Wyner [4] on the degraded wiretap channel
and the celebrated entropy-power inequality [5, Ch. 16.7]. (Alternatively, it can also be evaluated
using a classical result from estimation theory via a relationship between mutual information and
minimum mean-squared error estimation [6].) Unfortunately, the same approach does not extend
to the multi-antenna case, as the latter, in its general form, belongs to the class of nondegraded
wiretap channels. The problem of characterizing the secrecy capacity of the multi-antenna wiretap
channel remained open until the recent work of Khisti and Wornell [1].

In [1], Khisti and Wornell followed an indirect approach to evaluate the capacity expression (3)
for the multi-antenna wiretap channel. Key to their evaluation is the following genie-aided upper
bound

\begin{align}
I(U; Y_r) - I(U; Y_e) &\le I(U; Y_r, Y_e) - I(U; Y_e) \tag{4} \\
&= I(X; Y_r, Y_e) - I(X; Y_e) - \left[ I(X; Y_r, Y_e | U) - I(X; Y_e | U) \right] \tag{5} \\
&\le I(X; Y_r, Y_e) - I(X; Y_e) \tag{6} \\
&= I(X; Y_r | Y_e) \tag{7}
\end{align}

where (5) follows from the Markov chain $U - X - (Y_r, Y_e)$, and (6) follows from the trivial inequality
$I(X; Y_r, Y_e | U) \ge I(X; Y_e | U)$. Khisti and Wornell [1] further noticed that the original objective of
optimization $I(U; Y_r) - I(U; Y_e)$ depends on the channel transition probability $P(Y_r, Y_e | X)$ only
through the marginals $P(Y_r | X)$ and $P(Y_e | X)$, whereas the upper bound $I(X; Y_r | Y_e)$ does depend
on the joint conditional $P(Y_r, Y_e | X)$. A good upper bound on the secrecy capacity is thus obtained
as

\[
C = \max_{p(u,x)} \left[ I(U; Y_r) - I(U; Y_e) \right]
\le \min_{P(Y_r', Y_e' | X) \in \mathcal{P}} \max_{p(x)} I(X; Y_r' | Y_e')
= \max_{p(x)} \min_{P(Y_r', Y_e' | X) \in \mathcal{P}} I(X; Y_r' | Y_e') \tag{8}
\]

where $\mathcal{P}$ is the set of joint conditionals $P(Y_r', Y_e' | X)$ satisfying

\[
P(Y_r' | X) = P(Y_r | X) \quad \text{and} \quad P(Y_e' | X) = P(Y_e | X). \tag{9}
\]

The upper bound $\min_{P(Y_r', Y_e' | X) \in \mathcal{P}} \max_{p(x)} I(X; Y_r' | Y_e')$ has a specific physical meaning: it is the
secrecy capacity of the wiretap channel $P(Y_r', Y_e' | X)$ where the legitimate user has access to both $Y_r'$
and $Y_e'$, minimized over the worst cooperation between the legitimate receiver and the eavesdropper.
In essence, this is very similar to the Sato upper bound on the sum capacity of a general broadcast
channel [7]. For the multi-antenna wiretap channel, Khisti and Wornell [1] showed that the
conditional mutual information $I(X; Y_r' | Y_e')$ is maximized when the channel input $X$ is Gaussian.
Hence, the upper bounds in (8) can be written as a saddle-point matrix optimization problem.
By comparing the value of the optimal Gaussian solution for the original optimization problem
$\max_{p(u,x)} [I(U; Y_r) - I(U; Y_e)]$ with the upper bounds in (8), Khisti and Wornell [1] showed that
the results are identical and thus established the optimality of both matrix characterizations for
the multi-antenna wiretap channel. Operationally, Khisti and Wornell [1] showed that the original
multi-antenna wiretap channel has the same secrecy capacity as the channel in which the legitimate user has
access to both received signals, minimized over the worst cooperation between the legitimate
user and the eavesdropper. (The same approach was also followed by Shafiee et al. [8] and Oggier
and Hassibi [9] to characterize the secrecy capacity of the 2x2x1 and the general multi-antenna
wiretap channel, respectively.) Considering the disparity between these two physical scenarios, this
is a rather surprising result.
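
The tightness of the genie-aided bound can be seen numerically in the simplest setting. The following is a minimal sketch, assuming a scalar channel ($n_t = n_r = n_e = 1$), a Gaussian input of variance `power`, and jointly Gaussian noises whose correlation coefficient `rho` is the only free parameter of the joint conditional; the function names and the numerical values are illustrative and do not come from [1]. It evaluates $I(X; Y_r | Y_e)$ over a grid of correlations and compares the smallest value with the scalar secrecy capacity.

```python
import math

def genie_bound(power, var_r, var_e, rho):
    """I(X; Y_r | Y_e) in nats for X ~ N(0, power) when the two noises have
    variances var_r, var_e and correlation coefficient rho (|rho| < 1)."""
    cov = rho * math.sqrt(var_r * var_e)      # noise cross-covariance
    det = var_r * var_e - cov ** 2            # determinant of the 2x2 noise covariance
    # I(X; Y_r, Y_e) = 1/2 log(1 + power * 1' Sigma^{-1} 1) for the 2x2 noise covariance Sigma
    joint = 0.5 * math.log(1.0 + power * (var_r + var_e - 2.0 * cov) / det)
    return joint - 0.5 * math.log(1.0 + power / var_e)

def secrecy_capacity_scalar(power, var_r, var_e):
    """Secrecy capacity in nats of the scalar Gaussian wiretap channel."""
    if var_r >= var_e:
        return 0.0
    return 0.5 * math.log(1.0 + power / var_r) - 0.5 * math.log(1.0 + power / var_e)

# Illustrative numbers only.
power, var_r, var_e = 4.0, 1.0, 2.0
rho_star = math.sqrt(var_r / var_e)           # correlation that makes Y_e a degraded version of Y_r
grid = [i / 100.0 for i in range(-99, 100)] + [rho_star]
tightest = min(genie_bound(power, var_r, var_e, r) for r in grid)
capacity = secrecy_capacity_scalar(power, var_r, var_e)
print(tightest, capacity)
```

In this sketch the minimum is attained at `rho_star`, the correlation under which the eavesdropper's observation is a physically degraded version of the legitimate receiver's, and it coincides with the capacity up to floating-point error.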

The approach of Khisti and Wornell [1] also reminds us of the degraded same marginals bound
for the capacity region of the multi-antenna broadcast channel [10, 11]. There, the optimality of the
Gaussian input is hard to come by, and a precise characterization of the capacity region had to wait
until the proposal of a drastically different approach by Weingarten et al. [12]. Motivated by the
line of work on the multi-antenna broadcast channel, in this note we present a different approach to
characterize the secrecy capacity of the multi-antenna wiretap channel. Our approach is based on
an extremal entropy inequality recently proved in the context of multi-antenna broadcast channels
[13, 14], and is built directly on the physical intuition regarding the optimal transmission strategy
in this communication scenario.

2 Capacity Characterization via a Channel Enhancement Argument



2.1 Capacity characterization 

We consider a canonical version of the channel (vector Gaussian wiretap channel)

\[
\begin{aligned}
\mathbf{y}_r[m] &= \mathbf{x}[m] + \mathbf{w}_r[m] \\
\mathbf{y}_e[m] &= \mathbf{x}[m] + \mathbf{w}_e[m],
\end{aligned}
\tag{10}
\]

where $\mathbf{x}[m]$ is a real input vector of length $t$, and $\mathbf{w}_r[m]$ and $\mathbf{w}_e[m]$ are additive Gaussian noise
vectors with zero mean and covariance matrices $\mathbf{K}_r$ and $\mathbf{K}_e$ respectively, and are independent across
the time index $m$. The noise covariance matrices $\mathbf{K}_r$ and $\mathbf{K}_e$ are assumed to be positive definite.
The channel input satisfies a power-covariance constraint



\[
\frac{1}{n} \sum_{m=1}^{n} \mathbf{x}[m] \mathbf{x}^\top[m] \preceq \mathbf{S} \tag{11}
\]

where $\mathbf{S}$ is a positive definite matrix of size $t \times t$, and $\preceq$ represents "less or equal to" in the
positive semidefinite partial ordering between real symmetric matrices. Note that (11) is a rather
general constraint that subsumes many other constraints including the total power constraint (2).
Following [12, Sec. 5], it can be shown that for any channel gain matrices $\mathbf{H}_r$ and $\mathbf{H}_e$, there exists
a sequence of vector Gaussian wiretap channels (10) whose capacities approach that of the
multi-antenna wiretap channel (1). Without loss of generality, we shall focus on the vector Gaussian
wiretap channel (10) with power-covariance constraint (11) for the rest of the note.

We first present a matrix characterization for the secrecy capacity of a degraded vector Gaussian 
wiretap channel. 



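Theorem 1: If there exists a positive semidefinite matrix $\mathbf{K}_x^* \preceq \mathbf{S}$ such that

\[
\begin{aligned}
(\mathbf{K}_x^* + \mathbf{K}_r)^{-1} &= (\mathbf{K}_x^* + \mathbf{K}_e)^{-1} + \mathbf{M}_2 \\
(\mathbf{S} - \mathbf{K}_x^*) \mathbf{M}_2 &= \mathbf{0}
\end{aligned}
\tag{12}
\]

for some positive semidefinite matrix $\mathbf{M}_2$, the secrecy capacity of a degraded vector Gaussian wiretap
channel (10) with $\mathbf{K}_r \preceq \mathbf{K}_e$ can be written as

\[
C = \frac{1}{2} \log\det\left( \mathbf{I} + \mathbf{K}_x^* \mathbf{K}_r^{-1} \right) - \frac{1}{2} \log\det\left( \mathbf{I} + \mathbf{K}_x^* \mathbf{K}_e^{-1} \right). \tag{13}
\]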



Theorem 1 states that if there exists a positive semidefinite matrix $\mathbf{K}_x^* \preceq \mathbf{S}$ that satisfies (12),
then $U = \mathbf{X} \sim \mathcal{N}(\mathbf{0}, \mathbf{K}_x^*)$ is an optimal choice for the capacity expression (3) of a degraded vector
Gaussian wiretap channel. Note that this provides a sufficient condition for verifying the optimality of
a specific choice of $(U, \mathbf{X})$. To put this in perspective, proving the optimality of Gaussian $U = \mathbf{X}$ for
the degraded vector Gaussian wiretap channel can be done with relative ease using, for example,
the worst additive noise result of Diggavi and Cover [15]. However, even within the Gaussians, it
is not clear how one could obtain a sufficient condition for the optimal choice of the covariance
matrix, as the matrix optimization problem is (once again) a nonconvex one and the standard
Karush-Kuhn-Tucker (KKT) condition is (a priori) only a necessary condition.

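Proof of Theorem 1: For a degraded wiretap channel $P(Y_r, Y_e | X)$, Wyner [4] showed that the
secrecy capacity is given by

\[
\max_{p(x)} \left[ I(X; Y_r) - I(X; Y_e) \right]. \tag{14}
\]

It thus follows that the secrecy capacity of a degraded vector Gaussian wiretap channel (10) with
$\mathbf{K}_r \preceq \mathbf{K}_e$ can be written as

\begin{align}
C &= \max_{f(\mathbf{X}):\, E[\mathbf{X}\mathbf{X}^\top] \preceq \mathbf{S}} \left[ I(\mathbf{X}; \mathbf{X} + \mathbf{W}_r) - I(\mathbf{X}; \mathbf{X} + \mathbf{W}_e) \right] \tag{15} \\
&= \max_{f(\mathbf{X}):\, E[\mathbf{X}\mathbf{X}^\top] \preceq \mathbf{S}} \left[ h(\mathbf{X} + \mathbf{W}_r) - h(\mathbf{X} + \mathbf{W}_e) \right] - \left( \frac{1}{2} \log\det \mathbf{K}_r - \frac{1}{2} \log\det \mathbf{K}_e \right) \tag{16}
\end{align}

where $\mathbf{W}_r$ and $\mathbf{W}_e$ are length-$t$ Gaussian vectors with zero mean and covariance matrices $\mathbf{K}_r$ and $\mathbf{K}_e$
respectively, and are independent of $\mathbf{X}$. As a special case of Lemma 2 in [13], we have

\[
\max_{f(\mathbf{X}):\, E[\mathbf{X}\mathbf{X}^\top] \preceq \mathbf{S}} \left[ h(\mathbf{X} + \mathbf{W}_r) - h(\mathbf{X} + \mathbf{W}_e) \right] \le \frac{1}{2} \log\det(\mathbf{K}_x^* + \mathbf{K}_r) - \frac{1}{2} \log\det(\mathbf{K}_x^* + \mathbf{K}_e). \tag{17}
\]

(Inequality (17) was also implicitly used in [13, Appendix C]. For completeness, a proof is included
in Appendix A.) Substituting (17) into (16) shows that $C$ is at most the right-hand side of (13), and
the feasible input $\mathbf{X} \sim \mathcal{N}(\mathbf{0}, \mathbf{K}_x^*)$ attains this value, so we obtain the desired result (13). This completes
the proof. ■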
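
As a sanity check on Theorem 1, the following is a minimal sketch for the scalar degraded case, assuming $t = 1$ with noise variances `k_r <= k_e` and power cap `s`; the function names and numbers are illustrative. It takes $\mathbf{K}_x^* = \mathbf{S}$, computes the $\mathbf{M}_2$ required by (12), evaluates (13), and compares the result with a grid search over Gaussian input variances only. The grid search does not test non-Gaussian inputs, which is exactly the part handled by inequality (17).

```python
import math

def gaussian_rate(k, k_r, k_e):
    """Objective of (15) in nats for a scalar Gaussian input of variance k."""
    return 0.5 * math.log(1.0 + k / k_r) - 0.5 * math.log(1.0 + k / k_e)

def theorem1_scalar(s, k_r, k_e):
    """Scalar instance of Theorem 1 with the candidate K_x^* = s (needs 0 < k_r <= k_e)."""
    k_star = s
    m2 = 1.0 / (k_star + k_r) - 1.0 / (k_star + k_e)   # first equation of (12)
    slack = (s - k_star) * m2                          # second equation of (12), should be 0
    return m2, slack, gaussian_rate(k_star, k_r, k_e)  # last entry is (13)

# Illustrative numbers only.
s, k_r, k_e = 3.0, 0.5, 1.5
m2, slack, cap = theorem1_scalar(s, k_r, k_e)
best_on_grid = max(gaussian_rate(s * i / 1000.0, k_r, k_e) for i in range(1001))
print(m2, slack, cap, best_on_grid)
```

Because `k_r <= k_e`, the computed `m2` is nonnegative, so the conditions of (12) hold and the value of (13) matches the best Gaussian rate found on the grid.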

Next, we use a channel enhancement argument to lift the result of Theorem 1 to the general vector
Gaussian wiretap channel. The channel enhancement argument was first introduced by Weingarten
et al. [12] to characterize the capacity region of the multi-antenna broadcast channel. Here, it is
adapted to fit our purposes. The difference between the channel enhancement argument
here and that of Weingarten et al. [12] will be explained at the end of Sec. 2.2.

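Theorem 2: The secrecy capacity of a general vector Gaussian wiretap channel (10) can be written
as

\[
C = \max_{\mathbf{0} \preceq \mathbf{K}_x \preceq \mathbf{S}} \left[ \frac{1}{2} \log\det\left( \mathbf{I} + \mathbf{K}_x \mathbf{K}_r^{-1} \right) - \frac{1}{2} \log\det\left( \mathbf{I} + \mathbf{K}_x \mathbf{K}_e^{-1} \right) \right] \tag{18}
\]

where an optimal $\mathbf{K}_x$ (denoted here as $\mathbf{K}_x^*$) must satisfy

\[
\begin{aligned}
(\mathbf{K}_x^* + \mathbf{K}_r)^{-1} + \mathbf{M}_1 &= (\mathbf{K}_x^* + \mathbf{K}_e)^{-1} + \mathbf{M}_2 \\
\mathbf{K}_x^* \mathbf{M}_1 &= \mathbf{0} \\
(\mathbf{S} - \mathbf{K}_x^*) \mathbf{M}_2 &= \mathbf{0}
\end{aligned}
\tag{19}
\]

for some positive semidefinite matrices $\mathbf{M}_1$ and $\mathbf{M}_2$.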



Note that unlike Theorem 1, the characterization (19) for the optimal covariance matrix $\mathbf{K}_x^*$ is
based on the standard KKT condition and hence is only a necessary condition.

Proof of Theorem 2: Let $\mathbf{K}_x^*$ be an optimal solution to the optimization problem in (18). By the
KKT condition, $\mathbf{K}_x^*$ must satisfy the equations in (19). Recall the single-letter capacity expression
(3) and let $U = \mathbf{X} \sim \mathcal{N}(\mathbf{0}, \mathbf{K}_x^*)$. The secrecy capacity of a general vector Gaussian wiretap channel
(10) can be bounded from below as

\[
C \ge \frac{1}{2} \log\det\left( \mathbf{I} + \mathbf{K}_x^* \mathbf{K}_r^{-1} \right) - \frac{1}{2} \log\det\left( \mathbf{I} + \mathbf{K}_x^* \mathbf{K}_e^{-1} \right). \tag{20}
\]

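To prove the reverse inequality, consider a new vector Gaussian wiretap channel whose legitimate
receiver and eavesdropper noise covariance matrices are $\tilde{\mathbf{K}}_r$ and $\mathbf{K}_e$ respectively, where $\tilde{\mathbf{K}}_r$ is defined
through the equation

\[
(\mathbf{K}_x^* + \tilde{\mathbf{K}}_r)^{-1} = (\mathbf{K}_x^* + \mathbf{K}_r)^{-1} + \mathbf{M}_1. \tag{21}
\]

Following Lemmas 10 and 11 of [12], $\tilde{\mathbf{K}}_r$ has the following important properties:

1. $\tilde{\mathbf{K}}_r \preceq \{\mathbf{K}_r, \mathbf{K}_e\}$, that is, $\tilde{\mathbf{K}}_r \preceq \mathbf{K}_r$ and $\tilde{\mathbf{K}}_r \preceq \mathbf{K}_e$;

2. $\det(\mathbf{I} + \mathbf{K}_x^* \tilde{\mathbf{K}}_r^{-1}) = \det(\mathbf{I} + \mathbf{K}_x^* \mathbf{K}_r^{-1})$.

By virtue of $\tilde{\mathbf{K}}_r \preceq \mathbf{K}_e$, the new vector Gaussian wiretap channel is a degraded one. Furthermore,
by the first and third equations in (19) and by (21) we have

\[
\begin{aligned}
(\mathbf{K}_x^* + \tilde{\mathbf{K}}_r)^{-1} &= (\mathbf{K}_x^* + \mathbf{K}_e)^{-1} + \mathbf{M}_2 \\
(\mathbf{S} - \mathbf{K}_x^*) \mathbf{M}_2 &= \mathbf{0}.
\end{aligned}
\tag{22}
\]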

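It thus follows from Theorem 1 that the secrecy capacity of this new channel is equal to

\begin{align}
\tilde{C} &= \frac{1}{2} \log\det\left( \mathbf{I} + \mathbf{K}_x^* \tilde{\mathbf{K}}_r^{-1} \right) - \frac{1}{2} \log\det\left( \mathbf{I} + \mathbf{K}_x^* \mathbf{K}_e^{-1} \right) \tag{23} \\
&= \frac{1}{2} \log\det\left( \mathbf{I} + \mathbf{K}_x^* \mathbf{K}_r^{-1} \right) - \frac{1}{2} \log\det\left( \mathbf{I} + \mathbf{K}_x^* \mathbf{K}_e^{-1} \right) \tag{24}
\end{align}

where the last equality is due to the second property of $\tilde{\mathbf{K}}_r$. Note from the first property of $\tilde{\mathbf{K}}_r$
that $\tilde{\mathbf{K}}_r \preceq \mathbf{K}_r$. Reducing the noise covariance matrix for the legitimate receiver can only increase
the secrecy capacity, so we have

\[
C \le \tilde{C} = \frac{1}{2} \log\det\left( \mathbf{I} + \mathbf{K}_x^* \mathbf{K}_r^{-1} \right) - \frac{1}{2} \log\det\left( \mathbf{I} + \mathbf{K}_x^* \mathbf{K}_e^{-1} \right) \tag{25}
\]

which is the desired reverse inequality. Putting together (20) and (25) completes the proof of the
theorem. ■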
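
The construction in the proof can be traced by hand when all matrices are diagonal, because conditions (19) and the definition (21) then decouple into one scalar problem per coordinate. The following is a minimal sketch under that assumption (diagonal $\mathbf{S}$, $\mathbf{K}_r$, $\mathbf{K}_e$, represented as Python lists); the candidate $\mathbf{K}_x^*$ uses full power on coordinates where the legitimate receiver's noise is smaller and zero power elsewhere. The function names and numbers are illustrative and are not taken from [12].

```python
import math

def enhance_diagonal(s, k_r, k_e):
    """Per-coordinate (K_x^*, M_1, M_2, enhanced noise) for diagonal S, K_r, K_e."""
    rows = []
    for s_i, r_i, e_i in zip(s, k_r, k_e):
        k_i = s_i if r_i < e_i else 0.0                    # candidate entry of K_x^*
        gap = 1.0 / (k_i + r_i) - 1.0 / (k_i + e_i)         # equals M_2 - M_1 by the first line of (19)
        m1_i, m2_i = max(-gap, 0.0), max(gap, 0.0)          # both multipliers are nonnegative
        r_tilde = 1.0 / (1.0 / (k_i + r_i) + m1_i) - k_i    # solve (21) for the enhanced noise variance
        rows.append((k_i, m1_i, m2_i, r_tilde))
    return rows

def rate(k, n_r, n_e):
    """Objective of (18) in nats for diagonal matrices given as lists."""
    return sum(0.5 * math.log(1.0 + k_i / r_i) - 0.5 * math.log(1.0 + k_i / e_i)
               for k_i, r_i, e_i in zip(k, n_r, n_e))

# Illustrative numbers only: coordinate 1 favors the eavesdropper.
s = [2.0, 1.0, 3.0]
k_r = [0.5, 2.0, 1.0]
k_e = [1.0, 0.8, 4.0]
rows = enhance_diagonal(s, k_r, k_e)
k_star = [row[0] for row in rows]
k_tilde = [row[3] for row in rows]
print(k_star, k_tilde)
print(rate(k_star, k_r, k_e), rate(k_star, k_tilde, k_e))
```

In this diagonal sketch the enhanced noise equals the original noise on coordinates that carry power and drops to the eavesdropper's level elsewhere, so both properties of $\tilde{\mathbf{K}}_r$ can be checked coordinate by coordinate and the two printed rates agree.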

2.2 Physical intuition



Our approach of characterizing the secrecy capacity of the vector Gaussian wiretap channel hinges 
on the existence of an enhanced channel, which needs to satisfy: 

1. it is degraded, so the secrecy capacity can be readily characterized; 

2. it has the same secrecy capacity as the original wiretap channel. 

A priori, it is not clear whether such an enhanced channel always exists, let alone how to
construct one.

Our intuition regarding the existence of the enhanced channel came mainly from the parallel
Gaussian wiretap channel, which is a special case of the vector Gaussian wiretap channel (10) with
diagonal noise covariance matrices $\mathbf{K}_r$ and $\mathbf{K}_e$. In this case, it is shown in [16] that the optimal
transmission strategy is to transmit only over the subchannels for which the signal received by the
legitimate receiver is stronger than that received by the eavesdropper. Therefore, an enhanced channel can
be constructed by reducing the noise variance for the legitimate receiver in each of the remaining
(inactive) subchannels to the noise variance level of the eavesdropper. Clearly, the enhanced channel thus constructed is a
degraded parallel Gaussian broadcast channel. Furthermore, the secrecy capacity of the enhanced
channel is the same as that of the original channel, as the noise variances for the legitimate receiver did
not change at all for any of the "active" subchannels. Therefore, at least for the special case of the
parallel Gaussian wiretap channel, an enhanced channel does always exist.
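
The parallel-channel intuition can be written out in a few lines. The following is a minimal sketch, assuming per-subchannel power caps (a diagonal $\mathbf{S}$ in the power-covariance constraint (11)) rather than a total power budget, so no power allocation across subchannels is needed; the function names and numbers are illustrative and are not taken from [16].

```python
import math

def parallel_secrecy_capacity(power_caps, noise_r, noise_e):
    """Secrecy rate in nats when full power is used on every subchannel where the
    legitimate receiver is strictly less noisy; also returns those subchannel indices."""
    total, active = 0.0, []
    for i, (p, n_r, n_e) in enumerate(zip(power_caps, noise_r, noise_e)):
        if n_r < n_e:                                   # legitimate receiver is stronger here
            total += 0.5 * math.log(1.0 + p / n_r) - 0.5 * math.log(1.0 + p / n_e)
            active.append(i)
    return total, active

def enhanced_noise(noise_r, noise_e):
    """Lower the legitimate receiver's noise to the eavesdropper's level wherever it is larger."""
    return [min(n_r, n_e) for n_r, n_e in zip(noise_r, noise_e)]

# Illustrative numbers only: subchannel 1 favors the eavesdropper.
power_caps = [2.0, 1.0, 3.0]
noise_r = [0.5, 2.0, 1.0]
noise_e = [1.0, 0.8, 4.0]

original, active = parallel_secrecy_capacity(power_caps, noise_r, noise_e)
noise_r_enh = enhanced_noise(noise_r, noise_e)
enhanced, _ = parallel_secrecy_capacity(power_caps, noise_r_enh, noise_e)
print(active, noise_r_enh, original, enhanced)
```

In the enhanced channel every subchannel satisfies `noise_r_enh[i] <= noise_e[i]`, so it is degraded, and the inactive subchannels contribute nothing because the two receivers see the same noise level there.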

Carrying over to the general vector Gaussian wiretap channel, no information should be trans- 
mitted along any direction where the eavesdropper observes a stronger signal than the legitimate 
receiver. The effective channel for the eavesdropper is thus a degraded version of the effective chan- 
nel for the legitimate receiver. (This observation was also made by Khisti and Wornell [1].) This 
is the basis underlying the existence of the enhanced channel for a general vector Gaussian wiretap 
channel. 

Note that in characterizing the capacity region of the vector Gaussian broadcast channel (a 
canonical model for the multi-antenna broadcast channel), Weingarten et al. [12] enhanced each 
and every channel (by reducing the noise covariance matrices) from the transmitter to the receivers. 
In our argument, however, we only enhanced the channel for the legitimate receiver. (The channel
for the eavesdropper did not change at all.) This is due to the fact that in both arguments, the
enhancement, a priori, must increase the capacity (secrecy or regular) of the channel. (Otherwise, 
both arguments will break down.) Whereas reducing the noise covariances will benefit all the 
receivers and hence improve the capacity of the vector Gaussian broadcast channel, reducing the 
noise covariance matrix of the eavesdropper may compromise the security of the transmission scheme 
and hence lower the secrecy capacity of the vector Gaussian wiretap channel. This is the key
difference between the channel enhancement argument here and that of Weingarten et al. [12] for 
the vector Gaussian broadcast channel. 



A Proof of Inequality (17)



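To prove inequality (17), it is equivalent to show that $\mathbf{X}_G^* \sim \mathcal{N}(\mathbf{0}, \mathbf{K}_x^*)$ is an optimal solution to the
optimization problem

\[
\max_{f(\mathbf{X}):\, E[\mathbf{X}\mathbf{X}^\top] \preceq \mathbf{S}} \left[ h(\mathbf{X} + \mathbf{W}_r) - h(\mathbf{X} + \mathbf{W}_e) \right]
\]

which would handle the Gaussianity and the covariance matrix issues in one shot. For that purpose,
we shall prove that $g(\mathbf{X}) \le g(\mathbf{X}_G^*)$ where

\[
g(\mathbf{X}) := h(\mathbf{X} + \mathbf{W}_r) - h(\mathbf{X} + \mathbf{W}_e) \tag{26}
\]

for any $\mathbf{X}$ such that $E[\mathbf{X}\mathbf{X}^\top] \preceq \mathbf{S}$.

For any $\mathbf{X}$ such that $E[\mathbf{X}\mathbf{X}^\top] \preceq \mathbf{S}$ and any $\lambda \in [0, 1]$, let

\[
\mathbf{X}_\lambda := \sqrt{1 - \lambda}\, \mathbf{X} + \sqrt{\lambda}\, \mathbf{X}_G^* \tag{27}
\]

where we assume that $\mathbf{X}$ and $\mathbf{X}_G^*$ are independent. By the de Bruijn identity [5, Ch. 16.6],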



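\[
\frac{d g(\mathbf{X}_\lambda)}{d\lambda} = \frac{1}{2(1 - \lambda)} \mathrm{Tr}\left( (\mathbf{K}_x^* + \mathbf{K}_r) \mathbf{J}(\mathbf{X}_\lambda + \mathbf{W}_r) - (\mathbf{K}_x^* + \mathbf{K}_e) \mathbf{J}(\mathbf{X}_\lambda + \mathbf{W}_e) \right) \tag{28}
\]

where $\mathbf{J}(\mathbf{X})$ denotes the Fisher information matrix of $\mathbf{X}$. Recalling the vector Fisher information
inequality [14, Lemma 1]

\[
\mathbf{J}(\mathbf{X}_1 + \mathbf{X}_2) \preceq \mathbf{A} \mathbf{J}(\mathbf{X}_1) \mathbf{A}^\top + (\mathbf{I} - \mathbf{A}) \mathbf{J}(\mathbf{X}_2) (\mathbf{I} - \mathbf{A})^\top \tag{29}
\]

for two independent random vectors $\mathbf{X}_1$ and $\mathbf{X}_2$ and letting

\[
\mathbf{A} = (\mathbf{K}_x^* + \mathbf{K}_e)^{-1} (\mathbf{K}_x^* + \mathbf{K}_r), \tag{30}
\]

we have

\begin{align}
\mathbf{J}(\mathbf{X}_\lambda + \mathbf{W}_r) &\succeq \mathbf{A}^{-1} \left( \mathbf{J}(\mathbf{X}_\lambda + \mathbf{W}_e) - (\mathbf{I} - \mathbf{A}) \mathbf{J}(\mathbf{W}) (\mathbf{I} - \mathbf{A})^\top \right) \mathbf{A}^{-\top} \notag \\
&= (\mathbf{K}_x^* + \mathbf{K}_r)^{-1} \left( (\mathbf{K}_x^* + \mathbf{K}_e) \mathbf{J}(\mathbf{X}_\lambda + \mathbf{W}_e) (\mathbf{K}_x^* + \mathbf{K}_e) - (\mathbf{K}_e - \mathbf{K}_r) \right) (\mathbf{K}_x^* + \mathbf{K}_r)^{-1} \tag{31}
\end{align}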
where $\mathbf{W}$ is $\mathcal{N}(\mathbf{0}, \mathbf{K}_e - \mathbf{K}_r)$ and is independent of $(\mathbf{W}_r, \mathbf{X}, \mathbf{X}_G^*)$. Substituting (31) into (28), we
have

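\begin{align}
\frac{d g(\mathbf{X}_\lambda)}{d\lambda}
&\ge \frac{1}{2(1 - \lambda)} \mathrm{Tr}\left( \left( (\mathbf{K}_x^* + \mathbf{K}_e) \mathbf{J}(\mathbf{X}_\lambda + \mathbf{W}_e) (\mathbf{K}_x^* + \mathbf{K}_e) - (\mathbf{K}_e - \mathbf{K}_r) \right) (\mathbf{K}_x^* + \mathbf{K}_r)^{-1} - (\mathbf{K}_x^* + \mathbf{K}_e) \mathbf{J}(\mathbf{X}_\lambda + \mathbf{W}_e) \right) \notag \\
&= \frac{1}{2(1 - \lambda)} \mathrm{Tr}\left( (\mathbf{K}_x^* + \mathbf{K}_e) \left( \mathbf{J}(\mathbf{X}_\lambda + \mathbf{W}_e) (\mathbf{K}_x^* + \mathbf{K}_e) - \mathbf{I} \right) \mathbf{M}_2 \right) \tag{32} \\
&\ge \frac{1}{2(1 - \lambda)} \mathrm{Tr}\left( (\mathbf{K}_x^* + \mathbf{K}_e) \left( \left( (1 - \lambda) \mathbf{S} + \lambda \mathbf{K}_x^* + \mathbf{K}_e \right)^{-1} (\mathbf{K}_x^* + \mathbf{K}_e) - \mathbf{I} \right) \mathbf{M}_2 \right) \tag{33} \\
&= 0 \tag{34}
\end{align}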

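where equalities (32) and (34) are due to the equations in (12), and inequality (33) is due to the
well-known Cramér-Rao inequality

\[
\mathbf{J}(\mathbf{X}) \succeq \mathrm{Cov}^{-1}(\mathbf{X}) \tag{35}
\]

and the fact that $\mathrm{Cov}(\mathbf{X}) \preceq E[\mathbf{X}\mathbf{X}^\top] \preceq \mathbf{S}$. That is, $g(\mathbf{X}_\lambda)$ is a monotonically nondecreasing function
of $\lambda$ in $[0, 1]$. We thus have

\[
g(\mathbf{X}) = g(\mathbf{X}_0) \le g(\mathbf{X}_1) = g(\mathbf{X}_G^*). \tag{36}
\]

This completes the proof of inequality (17).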